Claims

What is claimed is: 

1. A method for use in providing improved fault tolerance in a computing system comprising
at least one computing machine, the method comprising the steps of:

executing a control program in conjunction with a fault tolerance software system
running on the at least one computing machine; and

initiating via the control program a test script program which sends one or more
requests to a monitored program, wherein the test script program processes corresponding responses
to the one or more requests, and generates at least one return value utilizable by the control program
to indicate a failure condition in the monitored program.

2. The method of claim 1 wherein the computing system is configured in accordance with
a client-server architecture and the at least one computing machine comprises a server of the
computing system.

3. The method of claim 1 wherein the control program comprises a control thread of a failure
detection process associated with a failure detection component of the fault tolerance software
system.

4. The method of claim 1 wherein the control program comprises a thread of a failure
detection process and the test script program comprises a process separate from the failure detection
process.

5. The method of claim 1 wherein the control program comprises a thread of a failure
detection process and the test script program comprises a thread of the same failure detection
process.

6. The method of claim 1 wherein the test script program is implemented in an
object-oriented programming language such that one or more components of the test script program
comprise a base class from which one or more other components of the test script program are
generatable for use with the monitored program.

7. The method of claim 6 wherein the object-oriented programming language comprises the
Java programming language.

8. The method of claim 6 wherein the one or more components comprising the base class 
comprise one or more of an initialization component, an obtain requests component, and a request 
interruption component. 

9. The method of claim 8 wherein the one or more other components generatable from the
base class comprise a request issuance component and a response verification component, both
particular to the monitored program.

10. The method of claim 9 wherein for each of the requests specified in the obtain requests
component, that component creates corresponding ones of the request issuance component and the
request interruption component, and wherein the request interruption component terminates the
corresponding request if a response from the monitored program is not received within a designated
period of time.

11. The method of claim 1 wherein the control program initiates a persistent program, and
the persistent program periodically initiates the test script program. 

12. The method of claim 11 wherein the persistent program comprises at least one of a
thread and a process.

13. The method of claim 11 wherein the persistent program receives the return value from
the test script program and delivers it to the control program.

14. The method of claim 1 wherein the test script program comprises an interpreted script.

15. The method of claim 1 wherein the test script program comprises a native executable. 

16. The method of claim 1 wherein the test script program comprises byte code. 

17. An apparatus for use in providing improved fault tolerance in a computing system, the 
apparatus comprising: 

at least one computing machine having a processor and a memory, the processor 
being operatively coupled to the memory, wherein the processor is operative: (i) to execute a control 
program in conjunction with a fault tolerance software system running on the at least one computing 
machine, and (ii) to initiate via the control program a test script program which sends one or more 
requests to a monitored program, wherein the test script program processes corresponding responses 
to the one or more requests, and generates at least one return value utilizable by the control program 
to indicate a failure condition in the monitored program. 

18. A storage medium for storing program code for use in providing improved fault 
tolerance in a computing system comprising at least one computing machine, wherein the program 
code when executed on the at least one computing machine performs the steps of: 

executing a control program in conjunction with a fault tolerance software system 
running on the at least one computing machine; and 

initiating via the control program a test script program which sends one or more 
requests to a monitored program, wherein the test script program processes corresponding responses 
to the one or more requests, and generates at least one return value utilizable by the control program 
to indicate a failure condition in the monitored program. 
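
For illustration only, the arrangement recited in claims 1 and 6 through 10 might be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not an implementation taken from the claims: the class names (ControlProgramDemo, TestScript, EchoTestScript), the return-value convention (0 for success, 1 for failure), the use of strings as requests and responses, and the timeout values are all hypothetical. The monitored program is simulated by a delayed echo so that the example is self-contained.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the control program: it initiates the test
// script and reads the return value as a failure indication.
class ControlProgramDemo {
    public static void main(String[] args) {
        int healthy = new EchoTestScript(500, 10).run();   // responds in time
        int hung = new EchoTestScript(50, 5000).run();     // exceeds the timeout
        System.out.println("healthy=" + healthy + " hung=" + hung);
    }
}

// Hypothetical base class holding the initialization, obtain requests,
// and request interruption components.
abstract class TestScript {
    static final int OK = 0;        // assumed return-value convention
    static final int FAILURE = 1;

    private final long timeoutMillis;

    TestScript(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Initialization component; does nothing by default.
    protected void initialize() {}

    // Obtain requests component: the requests to send to the monitored program.
    protected abstract List<String> obtainRequests();

    // Request issuance component, particular to the monitored program.
    protected abstract String issueRequest(String request) throws Exception;

    // Response verification component, particular to the monitored program.
    protected abstract boolean verifyResponse(String request, String response);

    // Issues each request, interrupting it if no response arrives in time.
    final int run() {
        initialize();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            for (String request : obtainRequests()) {
                Callable<String> call = () -> issueRequest(request);
                Future<String> pending = pool.submit(call);
                try {
                    String response = pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    if (!verifyResponse(request, response)) {
                        return FAILURE;
                    }
                } catch (TimeoutException e) {
                    pending.cancel(true);   // request interruption
                    return FAILURE;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return FAILURE;
                } catch (ExecutionException e) {
                    return FAILURE;         // the request itself failed
                }
            }
            return OK;
        } finally {
            pool.shutdownNow();
        }
    }
}

// Hypothetical subclass for a simulated monitored program that echoes
// each request after a fixed delay.
class EchoTestScript extends TestScript {
    private final long delayMillis;

    EchoTestScript(long timeoutMillis, long delayMillis) {
        super(timeoutMillis);
        this.delayMillis = delayMillis;
    }

    @Override
    protected List<String> obtainRequests() {
        return List.of("ping", "status");
    }

    @Override
    protected String issueRequest(String request) throws Exception {
        Thread.sleep(delayMillis);          // simulated round trip
        return "echo:" + request;
    }

    @Override
    protected boolean verifyResponse(String request, String response) {
        return ("echo:" + request).equals(response);
    }
}
```

In this sketch the timeout handling lives in the base class, so a subclass written for a particular monitored program only has to supply the request issuance and response verification logic.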